Skip to content

Conversation

@joao-r-reis
Copy link
Contributor

@joao-r-reis joao-r-reis commented May 29, 2024

JIRA

Motivation

In my experience troubleshooting connection issues in gocql is quite complicated because there's not enough logging information available to understand what is going on under the hood (control connection reconnections, nodes being considered UP/DOWN, nodes being added or removed, connection pool errors, etc.). There is already some logging for this in place at the moment but it requires a build flag to be set which is far from ideal.

Adding logging for all of this by default would make the default gocql experience a bit frustrating for users that do not care about all of this extra information so log levels are also needed. With log levels, one can make the logging more verbose if they are trying to troubleshoot a particular issue. Other C* drivers already provide this kind of functionality out of the box.

Proposal

As mentioned in the prior section, with this PR I'm adding logging for a lot of the scenarios that were described above and I'm also adding log levels so that log verbosity can be tuned by users.

On top of the additional logging and log levels, I'm also adding better support for structured logging (new StructuredLogger interface and field in the cluster config). Note that the current "printf" way needs to be supported at the very least until gocql v2 to maintain the same behavior that exists today in apps that are leveraging the StdLogger interface.

I'm proposing to set the default log level as WARNING because it looks like WARNING is the most appropriate default level taking into account what logging there is today in gocql by default (prior to this proposal).

It would be great if other community members could chime in on this proposal to improve any weak points that it might have and to make sure it addresses the needs of our users.

Usage/Examples

Example application side code with zerolog:

// write a new logger that implements gocql's StructuredLogger interface using zerolog
type Zerologlogger struct {
	L *zerolog.Logger
}

func (rec *Zerologlogger) log(event *zerolog.Event, fields ...gocql.LogField) *zerolog.Event {
	for _, field := range fields {
		event = event.Any(field.Name, field.Value)
	}
	return event
}

func (rec *Zerologlogger) Error(msg string, fields ...gocql.LogField) {
	rec.log(rec.L.Error(), fields...).Msg(msg)
}

func (rec *Zerologlogger) Warning(msg string, fields ...gocql.LogField) {
	rec.log(rec.L.Warn(), fields...).Msg(msg)
}

func (rec *Zerologlogger) Info(msg string, fields ...gocql.LogField) {
	rec.log(rec.L.Info(), fields...).Msg(msg)
}

func (rec *Zerologlogger) Debug(msg string, fields ...gocql.LogField) {
	rec.log(rec.L.Debug(), fields...).Msg(msg)
}

// example of building a session with this new logger
func buildCqlSession(logger *zerolog.Logger) (*gocql.Session, error) {
	cluster := gocql.NewCluster("127.0.0.1")
	cluster.StructuredLogger = &Zerologlogger{L: logger}
	return cluster.CreateSession()
}

Alternatives

I considered log/slog instead of creating a new Structured Logger interface from scratch but log/slog was only introduced in go 1.21 so I discarded this option for now. Maybe it would be a good idea to move to log/slog down the line when a new major version of gocql is being worked on?

Drawbacks

The biggest drawback in this solution is that in order for a user to take full advantage of the new StructuredLogger interface, they have to implement a new type while this wasn't needed in the case of StdLogger for some logging libraries.

I've added 2 new "extension" modules to provide an out of the box implementation of this new logger interface with zerolog and zap but I'm not sure if this would be the best approach to provide this functionality.

Log output examples

The following examples were collected following one simple procedure with CCM where:

  1. node1 (127.0.0.1) is brought DOWN
  2. node1 (127.0.0.1) is brought UP
  3. node3 (127.0.0.3) is brought DOWN
  4. node3 (127.0.0.3) is REMOVED
  5. node3 (127.0.0.1) is ADDED and brought UP

Each log sample was collected on a unique run of this procedure so these log samples might not be fully comparable with each other.

Zerolog with gocql minimum log level set to WARN

{"level":"warn","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","err":"read tcp 127.0.0.1:61968->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:57:43+01:00"
,"message":"gocql: connection pool filling failed %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.1","port":9042,"time":"2024-05-29T10:57:43+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"warn","host_addr":"127.0.0.1","port":9042,"time":"2024-05-29T10:58:03+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"warn","host_addr":"127.0.0.3","host_id":"83389975-1aef-4727-a2e1-9451691dd9bb","err":"read tcp 127.0.0.1:62078->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:58:40+01:00"
,"message":"gocql: connection pool filling failed %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.3","port":9042,"time":"2024-05-29T10:58:41+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"warn","host_addr":"127.0.0.3","port":9042,"time":"2024-05-29T10:59:00+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"warn","host_addr":"127.0.0.3","host_id":"83389975-1aef-4727-a2e1-9451691dd9bb","err":"dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.","time":"2024-05-29T10:59:35+01:00","
message":"gocql: connection pool filling failed %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.3","port":9042,"time":"2024-05-29T10:59:35+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"warn","frame":"REMOVED_NODE->127.0.0.3:9042","time":"2024-05-29T10:59:43+01:00","message":"gocql: received topology change event: %v"}
{"level":"warn","host_addr":"127.0.0.3","host_id":"83389975-1aef-4727-a2e1-9451691dd9bb","time":"2024-05-29T10:59:44+01:00","message":"gocql: removing host %v (%v)."}
{"level":"warn","frame":"NEW_NODE->127.0.0.3:9042","time":"2024-05-29T11:01:56+01:00","message":"gocql: received topology change event: %v"}

Default logger with gocql minimum log level set to WARN

2024/05/29 11:04:59 gocql: connection pool filling failed 127.0.0.1 (d0586c1b-9546-429a-81a6-278a6ffe903f): dial tcp 127.0.0.1:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 11:04:59 gocql: node is DOWN: 127.0.0.1:9042
2024/05/29 11:05:16 gocql: node is DOWN: 127.0.0.1:9042
2024/05/29 11:06:12 gocql: connection pool filling failed 127.0.0.3 (789de5e6-fc79-4066-aba9-0fb8f4cea344): read tcp 127.0.0.1:62444->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 11:06:13 gocql: node is DOWN: 127.0.0.3:9042
2024/05/29 11:06:32 gocql: node is DOWN: 127.0.0.3:9042
2024/05/29 11:06:43 gocql: connection pool filling failed 127.0.0.3 (789de5e6-fc79-4066-aba9-0fb8f4cea344): dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 11:06:43 gocql: node is DOWN: 127.0.0.3:9042
2024/05/29 11:07:23 gocql: received topology change event: REMOVED_NODE->127.0.0.3:9042
2024/05/29 11:07:24 gocql: removing host 127.0.0.3 (789de5e6-fc79-4066-aba9-0fb8f4cea344).
2024/05/29 11:09:26 gocql: received topology change event: NEW_NODE->127.0.0.3:9042

Zerolog with gocql minimum log level set to DEBUG

{"level":"debug","protocol_version":4,"host_addr":"127.0.0.1","host_id":"","time":"2024-05-29T10:48:10+01:00","message":"gocql: discovered protocol version %v using host %v (%s)."}
{"level":"info","protocol_version":4,"time":"2024-05-29T10:48:10+01:00","message":"gocql: discovered protocol version %v."}                                                             
{"level":"info","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","time":"2024-05-29T10:48:10+01:00","message":"gocql: adding host %v (%v)."}                    
{"level":"info","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","time":"2024-05-29T10:48:10+01:00","message":"gocql: control connection connected to %v (%s)."}
{"level":"info","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-60f32865-2f22-47fa-b3f4-034aea1b2bcc:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:48:10+01:00","message":"gocql: refreshed rin
g: %v."}
{"level":"info","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","time":"2024-05-29T10:48:10+01:00","message":"gocql: adding host %v (%v)."}
{"level":"info","host_addr":"127.0.0.2","host_id":"b1d368d0-d545-4213-843f-1e0229be11c4","time":"2024-05-29T10:48:10+01:00","message":"gocql: adding host %v (%v)."}
{"level":"debug","host_addr":"127.0.0.2","port":9042,"host_id":"b1d368d0-d545-4213-843f-1e0229be11c4","time":"2024-05-29T10:48:10+01:00","message":"gocql: connected to node: %s:%d (%s)\n"}
{"level":"debug","host_addr":"127.0.0.1","port":9042,"host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","time":"2024-05-29T10:48:10+01:00","message":"gocql: connected to node: %s:%d (%s)\n"}
{"level":"info","time":"2024-05-29T10:48:10+01:00","message":"gocql: Session initialized successfully."}
{"level":"debug","host_addr":"127.0.0.3","port":9042,"host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","time":"2024-05-29T10:48:10+01:00","message":"gocql: connected to node: %s:%d (%s)\n"}
{"level":"debug","host_addr":"127.0.0.2","host_id":"b1d368d0-d545-4213-843f-1e0229be11c4","count":2,"time":"2024-05-29T10:48:10+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"debug","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","count":2,"time":"2024-05-29T10:48:10+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"debug","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","count":2,"time":"2024-05-29T10:48:10+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"info","host_addr":"127.0.0.1","host_id":"","err":"read tcp 127.0.0.1:61292->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:48:48+01:00","message":"gocql: control connectio
n error %v (%s): %v\n"}
{"level":"info","time":"2024-05-29T10:48:48+01:00","message":"gocql: reconnecting the control connection."}
{"level":"info","addr":"127.0.0.1:9042","err":"read tcp 127.0.0.1:61297->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:48:48+01:00","message":"gocql: pool connection error %v: %v\n
"}
{"level":"info","addr":"127.0.0.1:9042","err":"read tcp 127.0.0.1:61293->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:48:48+01:00","message":"gocql: pool connection error %v: %v\n
"}
{"level":"info","host_addr":"127.0.0.1","port":9042,"host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","err":"read tcp 127.0.0.1:61305->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:
48:48+01:00","message":"gocql: unable to dial control conn %s:%v (%s): %v\n"}
{"level":"debug","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","err":"read tcp 127.0.0.1:61306->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:48:48+01:00
","message":"gocql: unable to dial %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","err":"read tcp 127.0.0.1:61306->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:48:48+01:00"
,"message":"gocql: connection pool filling failed %s (%s): %v\n"}
{"level":"info","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","time":"2024-05-29T10:48:48+01:00","message":"gocql: control connection connected to %v (%s)."}
{"level":"info","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-60f32865-2f22-47fa-b3f4-034aea1b2bcc:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:48:48+01:00","message":"gocql: refreshed rin
g: %v."}
{"level":"debug","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","count":0,"time":"2024-05-29T10:48:48+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.1","port":9042,"time":"2024-05-29T10:48:48+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"debug","frame":"[status_change change=DOWN host=127.0.0.1 port=9042]","time":"2024-05-29T10:49:07+01:00","message":"gocql: handling event frame: %v\n"}
{"level":"info","frame":"DOWN->127.0.0.1:9042","time":"2024-05-29T10:49:08+01:00","message":"gocql: dispatching status change event: %v\n"}
{"level":"warn","host_addr":"127.0.0.1","port":9042,"time":"2024-05-29T10:49:08+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"debug","time":"2024-05-29T10:49:10+01:00","message":"gocql: connecting to downed hosts if there is any."}
{"level":"debug","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:DOWN][127.0.0.3-60f32865-2f22-47fa-b3f4-034aea1b2bcc:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:49:10+01:00","message":"gocql: current ri
ng: %v."}
{"level":"info","host_addr":"127.0.0.1","host_port":9042,"host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","time":"2024-05-29T10:49:10+01:00","message":"gocql: reconnecting to downed host %v:%d (%v)."}
{"level":"debug","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","err":"dial tcp 127.0.0.1:9042: connectex: No connection could be made because the target machine actively refused it.","time":"2024-05-29T10:49:12+01:00",
"message":"gocql: unable to dial %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","err":"dial tcp 127.0.0.1:9042: connectex: No connection could be made because the target machine actively refused it.","time":"2024-05-29T10:49:12+01:00","
message":"gocql: connection pool filling failed %s (%s): %v\n"}
{"level":"debug","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","count":0,"time":"2024-05-29T10:49:12+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.1","port":9042,"time":"2024-05-29T10:49:12+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"debug","frame":"[status_change change=UP host=127.0.0.1 port=9042]","time":"2024-05-29T10:49:37+01:00","message":"gocql: handling event frame: %v\n"}
{"level":"info","frame":"UP->127.0.0.1:9042","time":"2024-05-29T10:49:38+01:00","message":"gocql: dispatching status change event: %v\n"}
{"level":"info","event_ip":"127.0.0.1","event_port":9042,"time":"2024-05-29T10:49:38+01:00","message":"gocql: node is UP: %s:%d\n"}
{"level":"debug","host_addr":"127.0.0.1","port":9042,"host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","time":"2024-05-29T10:49:38+01:00","message":"gocql: connected to node: %s:%d (%s)\n"}
{"level":"debug","host_addr":"127.0.0.1","host_id":"d0586c1b-9546-429a-81a6-278a6ffe903f","count":2,"time":"2024-05-29T10:49:38+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"info","addr":"127.0.0.3:9042","err":"read tcp 127.0.0.1:61298->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:50:02+01:00","message":"gocql: pool connection error %v: %v\n
"}
{"level":"info","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","err":"read tcp 127.0.0.1:61307->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:50:02+01:00"
,"message":"gocql: control connection error %v (%s): %v\n"}
{"level":"info","time":"2024-05-29T10:50:02+01:00","message":"gocql: reconnecting the control connection."}
{"level":"info","addr":"127.0.0.3:9042","err":"read tcp 127.0.0.1:61295->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:50:02+01:00","message":"gocql: pool connection error %v: %v\n
"}
{"level":"info","host_addr":"127.0.0.3","port":9042,"host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","err":"read tcp 127.0.0.1:61464->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:
50:02+01:00","message":"gocql: unable to dial control conn %s:%v (%s): %v\n"}
{"level":"debug","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","err":"read tcp 127.0.0.1:61465->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:50:02+01:00
","message":"gocql: unable to dial %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","err":"read tcp 127.0.0.1:61465->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.","time":"2024-05-29T10:50:02+01:00"
,"message":"gocql: connection pool filling failed %s (%s): %v\n"}
{"level":"info","host_addr":"127.0.0.2","host_id":"b1d368d0-d545-4213-843f-1e0229be11c4","time":"2024-05-29T10:50:02+01:00","message":"gocql: control connection connected to %v (%s)."}
{"level":"info","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-60f32865-2f22-47fa-b3f4-034aea1b2bcc:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:50:02+01:00","message":"gocql: refreshed rin
g: %v."}
{"level":"debug","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","count":0,"time":"2024-05-29T10:50:02+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.3","port":9042,"time":"2024-05-29T10:50:02+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"debug","time":"2024-05-29T10:50:10+01:00","message":"gocql: connecting to downed hosts if there is any."}
{"level":"debug","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-60f32865-2f22-47fa-b3f4-034aea1b2bcc:DOWN][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:50:10+01:00","message":"gocql: current ri
ng: %v."}
{"level":"info","host_addr":"127.0.0.3","host_port":9042,"host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","time":"2024-05-29T10:50:10+01:00","message":"gocql: reconnecting to downed host %v:%d (%v)."}
{"level":"debug","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","err":"dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.","time":"2024-05-29T10:50:12+01:00",
"message":"gocql: unable to dial %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","err":"dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.","time":"2024-05-29T10:50:12+01:00","
message":"gocql: connection pool filling failed %s (%s): %v\n"}
{"level":"debug","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","count":0,"time":"2024-05-29T10:50:12+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"warn","host_addr":"127.0.0.3","port":9042,"time":"2024-05-29T10:50:12+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"debug","frame":"[status_change change=DOWN host=127.0.0.3 port=9042]","time":"2024-05-29T10:50:20+01:00","message":"gocql: handling event frame: %v\n"}
{"level":"info","frame":"DOWN->127.0.0.3:9042","time":"2024-05-29T10:50:21+01:00","message":"gocql: dispatching status change event: %v\n"}
{"level":"warn","host_addr":"127.0.0.3","port":9042,"time":"2024-05-29T10:50:21+01:00","message":"gocql: node is DOWN: %s:%d\n"}
{"level":"debug","frame":"[topology_change change=REMOVED_NODE host=127.0.0.3 port=9042]","time":"2024-05-29T10:51:03+01:00","message":"gocql: handling event frame: %v\n"}
{"level":"warn","frame":"REMOVED_NODE->127.0.0.3:9042","time":"2024-05-29T10:51:04+01:00","message":"gocql: received topology change event: %v"}
{"level":"warn","host_addr":"127.0.0.3","host_id":"60f32865-2f22-47fa-b3f4-034aea1b2bcc","time":"2024-05-29T10:51:05+01:00","message":"gocql: removing host %v (%v)."}
{"level":"info","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:51:05+01:00","message":"gocql: refreshed ring: %v."}
{"level":"debug","time":"2024-05-29T10:51:10+01:00","message":"gocql: connecting to downed hosts if there is any."}
{"level":"debug","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:51:10+01:00","message":"gocql: current ring: %v."}
{"level":"debug","time":"2024-05-29T10:52:10+01:00","message":"gocql: connecting to downed hosts if there is any."}
{"level":"debug","ring":"[127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP][127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP]","time":"2024-05-29T10:52:10+01:00","message":"gocql: current ring: %v."}
{"level":"debug","frame":"[topology_change change=NEW_NODE host=127.0.0.3 port=9042]","time":"2024-05-29T10:52:54+01:00","message":"gocql: handling event frame: %v\n"}
{"level":"debug","frame":"[status_change change=UP host=127.0.0.3 port=9042]","time":"2024-05-29T10:52:54+01:00","message":"gocql: handling event frame: %v\n"}
{"level":"warn","frame":"NEW_NODE->127.0.0.3:9042","time":"2024-05-29T10:52:55+01:00","message":"gocql: received topology change event: %v"}
{"level":"info","frame":"UP->127.0.0.3:9042","time":"2024-05-29T10:52:55+01:00","message":"gocql: dispatching status change event: %v\n"}
{"level":"info","event_ip":"127.0.0.3","event_port":9042,"time":"2024-05-29T10:52:55+01:00","message":"gocql: node is UP: %s:%d\n"}
{"level":"info","host_addr":"127.0.0.3","host_id":"83389975-1aef-4727-a2e1-9451691dd9bb","time":"2024-05-29T10:52:56+01:00","message":"gocql: adding host %v (%v)."}
{"level":"info","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-83389975-1aef-4727-a2e1-9451691dd9bb:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:52:56+01:00","message":"gocql: refreshed rin
g: %v."}
{"level":"debug","host_addr":"127.0.0.3","port":9042,"host_id":"83389975-1aef-4727-a2e1-9451691dd9bb","time":"2024-05-29T10:52:56+01:00","message":"gocql: connected to node: %s:%d (%s)\n"}
{"level":"debug","host_addr":"127.0.0.3","host_id":"83389975-1aef-4727-a2e1-9451691dd9bb","count":2,"time":"2024-05-29T10:52:56+01:00","message":"gocql: conns of pool after stopped %s (%s): %v\n"}
{"level":"debug","time":"2024-05-29T10:53:10+01:00","message":"gocql: connecting to downed hosts if there is any."}
{"level":"debug","ring":"[127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-83389975-1aef-4727-a2e1-9451691dd9bb:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP]","time":"2024-05-29T10:53:10+01:00","message":"gocql: current ring
: %v."}

Default logger with gocql minimum log level set to DEBUG

2024/05/29 10:40:16 gocql: discovered protocol version 4 using host 127.0.0.2 ().
2024/05/29 10:40:16 gocql: discovered protocol version 4.                                                                                                                                            
2024/05/29 10:40:16 gocql: adding host 127.0.0.1 (d0586c1b-9546-429a-81a6-278a6ffe903f).                                                                                                             
2024/05/29 10:40:16 gocql: control connection connected to 127.0.0.1 (d0586c1b-9546-429a-81a6-278a6ffe903f).                                                                                         
2024/05/29 10:40:16 gocql: refreshed ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-921d79a6-4791-4cc3-9426-d5a85cd1cb24:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].
2024/05/29 10:40:16 gocql: adding host 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24).                                                                                                             
2024/05/29 10:40:16 gocql: adding host 127.0.0.2 (b1d368d0-d545-4213-843f-1e0229be11c4).                                                                                                             
2024/05/29 10:40:16 gocql: connected to node: 127.0.0.1:9042 (d0586c1b-9546-429a-81a6-278a6ffe903f)       
2024/05/29 10:40:16 gocql: connected to node: 127.0.0.2:9042 (b1d368d0-d545-4213-843f-1e0229be11c4)       
2024/05/29 10:40:16 gocql: Session initialized successfully.                                              
2024/05/29 10:40:16 gocql: connected to node: 127.0.0.3:9042 (921d79a6-4791-4cc3-9426-d5a85cd1cb24)       
2024/05/29 10:40:16 gocql: conns of pool after stopped 127.0.0.1 (d0586c1b-9546-429a-81a6-278a6ffe903f): 2
2024/05/29 10:40:16 gocql: conns of pool after stopped 127.0.0.2 (b1d368d0-d545-4213-843f-1e0229be11c4): 2
2024/05/29 10:40:16 gocql: conns of pool after stopped 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): 2
2024/05/29 10:40:44 gocql: pool connection error 127.0.0.1:9042: read tcp 127.0.0.1:60927->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 10:40:44 gocql: unable to dial 127.0.0.1 (d0586c1b-9546-429a-81a6-278a6ffe903f): read tcp 127.0.0.1:60935->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 10:40:44 gocql: connection pool filling failed 127.0.0.1 (d0586c1b-9546-429a-81a6-278a6ffe903f): read tcp 127.0.0.1:60935->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 10:40:44 gocql: pool connection error 127.0.0.1:9042: read tcp 127.0.0.1:60930->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 10:40:44 gocql: control connection error 127.0.0.1 (): read tcp 127.0.0.1:60925->127.0.0.1:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 10:40:44 gocql: reconnecting the control connection.
2024/05/29 10:40:44 gocql: conns of pool after stopped 127.0.0.1 (d0586c1b-9546-429a-81a6-278a6ffe903f): 0
2024/05/29 10:40:44 gocql: node is DOWN: 127.0.0.1:9042
2024/05/29 10:40:46 gocql: unable to dial control conn 127.0.0.1:9042 (d0586c1b-9546-429a-81a6-278a6ffe903f): dial tcp 127.0.0.1:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 10:40:46 gocql: control connection connected to 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24).
2024/05/29 10:40:46 gocql: refreshed ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:DOWN][127.0.0.3-921d79a6-4791-4cc3-9426-d5a85cd1cb24:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].
2024/05/29 10:41:02 gocql: handling event frame: [status_change change=DOWN host=127.0.0.1 port=9042]
2024/05/29 10:41:03 gocql: dispatching status change event: DOWN->127.0.0.1:9042
2024/05/29 10:41:03 gocql: node is DOWN: 127.0.0.1:9042
2024/05/29 10:41:06 gocql: handling event frame: [status_change change=UP host=127.0.0.1 port=9042]
2024/05/29 10:41:07 gocql: dispatching status change event: UP->127.0.0.1:9042
2024/05/29 10:41:07 gocql: node is UP: 127.0.0.1:9042
2024/05/29 10:41:07 gocql: connected to node: 127.0.0.1:9042 (d0586c1b-9546-429a-81a6-278a6ffe903f)
2024/05/29 10:41:07 gocql: conns of pool after stopped 127.0.0.1 (d0586c1b-9546-429a-81a6-278a6ffe903f): 2
2024/05/29 10:41:16 gocql: connecting to downed hosts if there is any.
2024/05/29 10:41:16 gocql: current ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-921d79a6-4791-4cc3-9426-d5a85cd1cb24:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].
2024/05/29 10:41:21 gocql: pool connection error 127.0.0.3:9042: read tcp 127.0.0.1:60928->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 10:41:21 gocql: control connection error 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): read tcp 127.0.0.1:60939->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 10:41:21 gocql: reconnecting the control connection.
2024/05/29 10:41:21 gocql: pool connection error 127.0.0.3:9042: read tcp 127.0.0.1:60931->127.0.0.3:9042: wsarecv: An existing connection was forcibly closed by the remote host.
2024/05/29 10:41:22 gocql: control connection heartbeat failed: gocql: connection closed waiting for response.
2024/05/29 10:41:23 gocql: unable to dial control conn 127.0.0.3:9042 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 10:41:23 gocql: unable to dial 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 10:41:23 gocql: connection pool filling failed 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 10:41:23 gocql: control connection connected to 127.0.0.2 (b1d368d0-d545-4213-843f-1e0229be11c4).
2024/05/29 10:41:23 gocql: refreshed ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-921d79a6-4791-4cc3-9426-d5a85cd1cb24:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].
2024/05/29 10:41:23 gocql: conns of pool after stopped 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): 0
2024/05/29 10:41:23 gocql: node is DOWN: 127.0.0.3:9042
2024/05/29 10:41:40 gocql: handling event frame: [status_change change=DOWN host=127.0.0.3 port=9042]
2024/05/29 10:41:41 gocql: dispatching status change event: DOWN->127.0.0.3:9042
2024/05/29 10:41:41 gocql: node is DOWN: 127.0.0.3:9042
2024/05/29 10:42:16 gocql: connecting to downed hosts if there is any.
2024/05/29 10:42:16 gocql: current ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-921d79a6-4791-4cc3-9426-d5a85cd1cb24:DOWN][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].
2024/05/29 10:42:16 gocql: reconnecting to downed host 127.0.0.3:9042 (921d79a6-4791-4cc3-9426-d5a85cd1cb24).
2024/05/29 10:42:18 gocql: unable to dial 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 10:42:18 gocql: connection pool filling failed 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 10:42:18 gocql: conns of pool after stopped 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): 0
2024/05/29 10:42:18 gocql: node is DOWN: 127.0.0.3:9042
2024/05/29 10:43:15 gocql: handling event frame: [topology_change change=REMOVED_NODE host=127.0.0.3 port=9042]
2024/05/29 10:43:16 gocql: received topology change event: REMOVED_NODE->127.0.0.3:9042
2024/05/29 10:43:16 gocql: connecting to downed hosts if there is any.
2024/05/29 10:43:16 gocql: current ring: [127.0.0.3-921d79a6-4791-4cc3-9426-d5a85cd1cb24:DOWN][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP][127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP].
2024/05/29 10:43:16 gocql: reconnecting to downed host 127.0.0.3:9042 (921d79a6-4791-4cc3-9426-d5a85cd1cb24).
2024/05/29 10:43:17 gocql: removing host 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24).
2024/05/29 10:43:17 gocql: refreshed ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].
2024/05/29 10:43:18 gocql: unable to dial 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 10:43:18 gocql: connection pool filling failed 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): dial tcp 127.0.0.3:9042: connectex: No connection could be made because the target machine actively refused it.
2024/05/29 10:43:18 gocql: conns of pool after stopped 127.0.0.3 (921d79a6-4791-4cc3-9426-d5a85cd1cb24): 0
2024/05/29 10:43:18 gocql: node is DOWN: 127.0.0.3:9042
2024/05/29 10:44:16 gocql: connecting to downed hosts if there is any.
2024/05/29 10:44:16 gocql: current ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].
2024/05/29 10:45:12 gocql: handling event frame: [topology_change change=NEW_NODE host=127.0.0.3 port=9042]
2024/05/29 10:45:12 gocql: handling event frame: [status_change change=UP host=127.0.0.3 port=9042]
2024/05/29 10:45:13 gocql: received topology change event: NEW_NODE->127.0.0.3:9042
2024/05/29 10:45:13 gocql: dispatching status change event: UP->127.0.0.3:9042
2024/05/29 10:45:13 gocql: node is UP: 127.0.0.3:9042
2024/05/29 10:45:14 gocql: adding host 127.0.0.3 (60f32865-2f22-47fa-b3f4-034aea1b2bcc).
2024/05/29 10:45:14 gocql: refreshed ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-60f32865-2f22-47fa-b3f4-034aea1b2bcc:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].
2024/05/29 10:45:14 gocql: connected to node: 127.0.0.3:9042 (60f32865-2f22-47fa-b3f4-034aea1b2bcc)
2024/05/29 10:45:14 gocql: conns of pool after stopped 127.0.0.3 (60f32865-2f22-47fa-b3f4-034aea1b2bcc): 2
2024/05/29 10:45:16 gocql: connecting to downed hosts if there is any.
2024/05/29 10:45:16 gocql: current ring: [127.0.0.1-d0586c1b-9546-429a-81a6-278a6ffe903f:UP][127.0.0.3-60f32865-2f22-47fa-b3f4-034aea1b2bcc:UP][127.0.0.2-b1d368d0-d545-4213-843f-1e0229be11c4:UP].

@joao-r-reis joao-r-reis changed the title Add more logging, a Structured Logger interface and log levels CASSANDRA-19996 Add more logging, a Structured Logger interface and log levels Oct 11, 2024
@joao-r-reis joao-r-reis changed the title CASSANDRA-19996 Add more logging, a Structured Logger interface and log levels CASSGO-9 Add more logging, a Structured Logger interface and log levels Oct 29, 2024
@joao-r-reis
Copy link
Contributor Author

Maybe it is better to move the extension modules to packages instead, I've raised this in #1770 (comment)

@worryg0d
Copy link
Contributor

worryg0d commented Mar 21, 2025

Hello, @joao-r-reis!

I like the proposal. Debugging gocql is a real headache sometimes. Getting rid of the gocql_debug build tag is also a good idea to simplify the debugging process for end users.

The new API looks good to me. It lacks Trace() but I don't think we need this.
Having something like groups to log different driver components (conn, pool, queryExecutor) is probably a good idea, but it will make implementing the interface a bit harder, and some logging libraries may not provide this feature...

  func Group(fields ...string) StructuredLogger

Anyway, it can be handled on the driver side by just adding corresponding fields.

considered log/slog instead of creating a new Structured Logger interface from scratch but log/slog was only introduced in go 1.21 so I discarded this option for now.

Yeah, this is a shame. It would be cool if it depended on the existing logging interfaces as slog. We should actually decide the go version gocql will be bumped to.

@joao-r-reis
Copy link
Contributor Author

Having something like groups to log different driver components (conn, pool, queryExecutor) is probably a good idea, but it will make implementing the interface a bit harder, and some logging libraries may not provide this feature...

This is an interesting suggestion but personally hammering out the details of that would probably be a bit complex so I'd rather leave that out of scope of this PR. Maybe it's a good enhancement idea for the future?

@joao-r-reis
Copy link
Contributor Author

Maybe it is better to move the extension modules to packages instead, I've raised this in #1770 (comment)

I've done this justfyi

@jameshartig
Copy link
Contributor

I considered log/slog instead of creating a new Structured Logger interface from scratch but log/slog was only introduced in go 1.21 so I discarded this option for now.

I'm hesitant to change anything major without considering how it will work with log/slog. The downside is that right now we can't directly use slog.Attr. We could make an identical type for older Go and in newer go use a type alias to slog.Attr. That said, I think LogField is close and we could make a SlogLogger wrapper in the future. What're your thoughts about more closely aligning with slog?

"message":"gocql: received topology change event: %v"

What is the reason for keeping the printf verbs in the messages? Do we need to maintain backwards compatibility with old log messages?

@joao-r-reis
Copy link
Contributor Author

What is the reason for keeping the printf verbs in the messages? Do we need to maintain backwards compatibility with old log messages?

In this PR I kept the old logger printf based interface as deprecated and my reasoning was that this is something that is probably widely used and breaking it could mean users feel discouraged to upgrade to 2.0. I'm open to the idea of removing the old logger though...

I'm hesitant to change anything major without considering how it will work with log/slog. The downside is that right now we can't directly use slog.Attr. We could make an identical type for older Go and in newer go use a type alias to slog.Attr. That said, I think LogField is close and we could make a SlogLogger wrapper in the future. What're your thoughts about more closely aligning with slog?

My LogField implementation is basically the slog one with support for fewer things that we don't really need. I agree with the premise but what does it mean in practice? I think you can already pretty much implement a slog wrapper with what is on this PR right now. I thought about adding a slog extension similar to the zerolog and zap ones that I added but because these are packages it would require increasing the Go version on the entire module which isn't what we want. I very much agree that whatever we do here should make it possible to easily support slog integration and in the future even directly depend on slog when we increase the required Go version.

@joao-r-reis
Copy link
Contributor Author

I'm going to fix the conflicts today or tomorrow, this PR and the external type definitions PR are the two largest blocks that we have before we can move forward with the release

@jameshartig
Copy link
Contributor

In this PR I kept the old logger printf based interface as deprecated and my reasoning was that this is something that is probably widely used and breaking it could mean users feel discouraged to upgrade to 2.0.

I see what you mean but I'm concerned that if we switch logging and keep them in there then we're just continuing the dependence. I don't really see a great solution since we want to keep the existing logs unless we want to add 2 log calls to every spot where we log. Having the log message as "connection pool filling failed %s (%s): %v" in the new structured logs is less than ideal. Maybe we can just make it very clear that the messages will change and no one should rely on the messages being static?

@joao-r-reis
Copy link
Contributor Author

joao-r-reis commented Jun 9, 2025

I think I'll just replace the interface instead of adding a new one to use this opportunity that we're releasing 2.0. Users that rely on the current logger interface shouldn't find it too hard to adapt their implementation to the new interface

@joao-r-reis joao-r-reis force-pushed the logging branch 2 times, most recently from cf19dad to 1a7382f Compare June 10, 2025 19:17
@joao-r-reis joao-r-reis requested a review from jameshartig June 10, 2025 19:23
@joao-r-reis joao-r-reis changed the title CASSGO-9 Add more logging, a Structured Logger interface and log levels CASSGO-9 Add more logging and a Structured Logger interface with log levels Jun 12, 2025
The logger interface uses printf semantics and lacks the notion of log levels.
This commit changes (breaking change) the logger interface so printf semantics
are removed in favor of structured logging and adds log levels.

It provides 3 built in log implementations:
  - Default logger - uses log standard library
  - gocqlzap         - uses zap library
  - gocqlzerolog   - uses zerolog library

Users can use these implementations as is or they can implement their own
using these as examples.

Patch by João Reis; reviewed by James Hartig, Bohdan Siryk for CASSGO-9
@joao-r-reis joao-r-reis merged commit eda74b6 into apache:trunk Jun 17, 2025
72 checks passed
@joao-r-reis joao-r-reis deleted the logging branch June 17, 2025 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants